REF: Base class for all extension tests #19863

TomAugspurger · 2018-02-23T12:09:25Z

Added the staticmethods

assert_series_equal
assert_frame_equal

to the base class. Useful for test cases like DecimalArray with NaNs,
since our assert_*_equal methods don't properly handle Decimal NaNs.
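The pattern can be sketched without pandas. In this minimal illustration, `BaseExtensionTests`, `BaseConstructorsTests`, and `DecimalTests` are hypothetical names, and a plain element-by-element comparison stands in for the real `assert_series_equal`; only the idea of overridable staticmethods on a base class comes from the description above.

```python
from decimal import Decimal


class BaseExtensionTests:
    """Hypothetical base class: shared tests call self.assert_series_equal."""

    @staticmethod
    def assert_series_equal(left, right):
        # Stand-in default: strict element-by-element equality.
        assert len(left) == len(right), "lengths differ"
        for i, (a, b) in enumerate(zip(left, right)):
            assert a == b, f"values differ at position {i}: {a!r} != {b!r}"


class BaseConstructorsTests(BaseExtensionTests):
    def test_copy_roundtrip(self, data):
        # Shared test: goes through the overridable hook, not a global helper.
        self.assert_series_equal(data, list(data))


class DecimalTests(BaseConstructorsTests):
    @staticmethod
    def assert_series_equal(left, right):
        # Override: two Decimal NaNs in the same position count as equal.
        assert len(left) == len(right), "lengths differ"
        for i, (a, b) in enumerate(zip(left, right)):
            if a.is_nan() and b.is_nan():
                continue
            assert a == b, f"values differ at position {i}: {a!r} != {b!r}"


data = [Decimal("1.5"), Decimal("NaN")]
DecimalTests().test_copy_roundtrip(data)  # passes with the NaN-aware override
```

With the default comparison the same shared test would fail on the NaN entry, which is the situation the overridable methods are meant to handle.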

tests added / passed
passes git diff upstream/master -u -- "*.py" | flake8 --diff

jreback

why are we advertising this kind of api for testing? when we always use tm.assert_*? this is too easy to get wrong. wouldn't it be better to use that as the main one (e.g. make them EA aware)?

jreback · 2018-02-23T12:28:49Z

pandas/tests/extension/base/constructors.py

@@ -4,8 +4,10 @@
 import pandas.util.testing as tm


ideally we don't / shouldn't import tm in any of the subclasses of EA for testing (otherwise it's easy to accidentally NOT use self.assert_*)

jorisvandenbossche

Looks good to me

jorisvandenbossche · 2018-02-23T12:35:14Z

pandas/tests/extension/base/__init__.py

+
+All the tests in these modules use ``self.assert_frame_equal`` or
+``self.assert_series_equal`` for dataframe or series comparisons. By default,
+they use the usual ``pandas.util.testing.assert_frame_equal`` and


pandas.util.testing.assert_frame_equal -> pandas.testing.assert_frame_equal (that is the public API path)

jorisvandenbossche · 2018-02-23T12:37:14Z

pandas/tests/extension/base/constructors.py

@@ -4,8 +4,10 @@
 import pandas.util.testing as tm


It's still used for things like assert_raises_regex. But maybe we could import them explicitly (from pandas.util.testing import ...).

TomAugspurger · 2018-02-23T12:38:33Z

wouldn't it be better to use that as the main one (e.g. make them EA aware)?

I thought about doing that for the test cases we've written. Decimal('NaN') is the problem I've hit since we don't compare NaNs "correctly". But I don't think it'd be worthwhile putting in fixes for those specific ones since,

1.) It's only solving the problem for our specific functions. 3rd party libraries writing EAs would have no way to fix it.
2.) It's complicating our assert_*_equal methods for very little gain.
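The Decimal('NaN') behaviour referred to above can be reproduced with the standard library alone. This is only an illustration of the underlying comparison semantics; `nan_aware_equal` is a hypothetical helper and not a pandas function.

```python
from decimal import Decimal

nan = Decimal("NaN")

# A quiet Decimal NaN never compares equal with ==, not even to itself.
print(nan == nan)    # False
print(nan.is_nan())  # True


def nan_aware_equal(a, b):
    """Return True if a == b, or if both are Decimal NaNs (hypothetical helper)."""
    if a.is_nan() and b.is_nan():
        return True
    return a == b
```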

(otherwise easy to accidentally NOT use self.assert_*)

Yes, this is going to be a problem going forward. I'm not sure how best to avoid it, but avoiding a tm import is a good start.

TomAugspurger · 2018-02-23T12:39:29Z

I could also add a linting rule that looks for tm.assert* in these submodules. Thoughts?

Edit: ahh that may be hard since in a followup I define assert_*_equal for DecimalArray, but it uses tm.assert_*_ internally.

jorisvandenbossche · 2018-02-23T12:39:42Z

why are we advertising this kind of api for testing? when we always use tm.assert_*? this is too easy to get wrong. wouldn't it be better to use that as the main one (e.g. make them EA aware)?

It's only for testing extension arrays (so for developers), not public API for users.
The problem is that we cannot know in general how an extension array will be laid out internally, so we cannot make our test methods fully aware of them.

Although, one alternative would be to define an equals method on ExtensionArray that deals with NaNs.

TomAugspurger · 2018-02-23T12:46:11Z

Although, one alternative would be to define an equals method on ExtensionArray that deals with NaNs.

And then just call .equals on it in assert_series_equal if they're an extension dtype? I could get behind that.

jorisvandenbossche · 2018-02-23T12:53:35Z

I am not sure I am fully convinced myself :). The problem with an .equals method like that is that it does not leave space for an element-wise equals method. But of course, with Series.equals already there, it might also be confusing to add such a method as an extension author.

TomAugspurger · 2018-02-23T14:17:14Z

does not leave the space for an element-wise equals method

I think following the lead of Categorical is best here. .equals for returning True/False and .__eq__ for elementwise.

That said, .equals isn't great for the assert_*_equal methods since we want an informative traceback.
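As a rough illustration of this split, the toy class below (a hypothetical `ToyArray`, not pandas code) gives `__eq__` element-wise semantics and `equals` a single True/False answer. The `assert_toy_equal` helper is likewise an assumption; it shows how an assertion that reports the mismatching position yields a more useful failure than a bare `assert a.equals(b)`.

```python
class ToyArray:
    """Hypothetical array wrapper used only to illustrate the two methods."""

    def __init__(self, values):
        self.values = list(values)

    def __eq__(self, other):
        # Element-wise comparison: one boolean per position.
        return [a == b for a, b in zip(self.values, other.values)]

    def equals(self, other):
        # Whole-array comparison: a single boolean.
        return len(self.values) == len(other.values) and all(self == other)


def assert_toy_equal(left, right):
    # Reports where the arrays differ, unlike `assert left.equals(right)`.
    assert len(left.values) == len(right.values), "lengths differ"
    for i, same in enumerate(left == right):
        assert same, f"position {i}: {left.values[i]!r} != {right.values[i]!r}"
```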

I should be able to add a linting rule that looks for tm.assert_*_equal only in the extension/base module. Then we're free to use it in extension/decimal.py

codecov · 2018-02-23T14:23:40Z

Codecov Report

❗ No coverage uploaded for pull request base (master@01e99de).
The diff coverage is n/a.

@@            Coverage Diff            @@
##             master   #19863   +/-   ##
=========================================
  Coverage          ?   91.64%           
=========================================
  Files             ?      150           
  Lines             ?    48946           
  Branches          ?        0           
=========================================
  Hits              ?    44858           
  Misses            ?     4088           
  Partials          ?        0

Flag	Coverage Δ
#multiple	`90.02% <ø> (?)`
#single	`41.81% <ø> (?)`

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 01e99de...e0164a4.

TomAugspurger · 2018-02-23T14:24:27Z

OK, this should now fail if tests in extension/base/*.py use tm.assert_(frame|series)_equal. I've excluded base/base.py since that's where the defaults are set.
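A check of this kind could look roughly like the following. This is a sketch in Python under stated assumptions, not the rule added in this PR: the function name `find_forbidden_asserts`, the regular expression, and the way file contents are passed in are all hypothetical. It flags `tm.assert_frame_equal` and `tm.assert_series_equal` in files directly under `extension/base/` while skipping `base.py`.

```python
import re
from pathlib import PurePosixPath

# Assumed pattern: direct calls through the `tm` alias.
FORBIDDEN = re.compile(r"\btm\.assert_(frame|series)_equal\b")


def find_forbidden_asserts(sources):
    """Return (path, line_number, line) for each offending line.

    `sources` maps a POSIX-style path to that file's text.
    """
    hits = []
    for path, text in sources.items():
        p = PurePosixPath(path)
        in_base_dir = (
            p.parent.name == "base" and p.parent.parent.name == "extension"
        )
        if not in_base_dir or p.name == "base.py":
            continue  # outside extension/base/, or the file defining defaults
        for number, line in enumerate(text.splitlines(), start=1):
            if FORBIDDEN.search(line):
                hits.append((path, number, line.strip()))
    return hits
```

Taking the file contents as a mapping keeps the sketch independent of the file system; a real check would read the files from disk and exit with a non-zero status when any hit is found.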

jorisvandenbossche · 2018-02-23T14:37:28Z

I think following the lead of Categorical is best here. .equals for returning True/False and .__eq__ for elementwise.

But __eq__ has no flexibility regarding NaNs, so that's a reason one would want an element-wise equals (and from that, you can always do a .all() to get a single boolean). Anyhow, the existing behaviour is of course there, but if I were designing it from scratch, I'm not sure I would do equals as it is now.
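To make the trade-off concrete, here is a minimal sketch of an element-wise comparison with a NaN switch, reduced to a single boolean with `all()`. The function name `elementwise_equals` and its `equal_nan` parameter are hypothetical, chosen for illustration; nothing like it is defined in this PR.

```python
import math


def elementwise_equals(left, right, equal_nan=False):
    """Compare two equal-length sequences of floats position by position."""
    if len(left) != len(right):
        raise ValueError("sequences must have the same length")
    result = []
    for a, b in zip(left, right):
        both_nan = math.isnan(a) and math.isnan(b)
        result.append(a == b or (equal_nan and both_nan))
    return result


left = [1.0, float("nan")]
right = [1.0, float("nan")]
mask = elementwise_equals(left, right, equal_nan=True)
print(mask, all(mask))  # [True, True] True
```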

jorisvandenbossche · 2018-02-23T14:38:41Z

But for this PR, I would personally do it the way you did here for now. We can still discuss later whether we want an equals method or not.

jreback · 2018-02-24T15:19:22Z

thanks!

REF: Base class for all extension tests

6d5f5ac

Added the staticmethods - assert_series_equal - assert_frame_equal to the base class. Useful for test cases like DecimalArray with NaNs, since our assert_*_equal methods don't properly handle Decimal NaNs.

TomAugspurger added the Refactor, Testing, and Internals labels Feb 23, 2018

TomAugspurger added this to the 0.23.0 milestone Feb 23, 2018

TomAugspurger mentioned this pull request Feb 23, 2018

ExtensionArray meta-issue #19696

Closed

15 tasks

jreback requested changes Feb 23, 2018

jorisvandenbossche reviewed Feb 23, 2018

Check for uses of tm.assert_ in extension tests.

e0164a4

TomAugspurger mentioned this pull request Feb 23, 2018

ENH: ExtensionArray.unique #19869

Merged

jreback approved these changes Feb 24, 2018

jreback merged commit e362281 into pandas-dev:master Feb 24, 2018

harisbal pushed a commit to harisbal/pandas that referenced this pull request Feb 28, 2018

REF: Base class for all extension tests (pandas-dev#19863)

faf595e

TomAugspurger deleted the fu1+base-test-class branch May 2, 2018 13:10
